Mammalian MicroRNA Prediction through a Support Vector Machine Model of Sequence and Structure

Authors

  • Ying Sheng
  • Pär G. Engström
  • Boris Lenhard
Abstract

BACKGROUND: MicroRNAs (miRNAs) are endogenous small noncoding RNA gene products, on average 22 nt long, found in a wide variety of organisms. They play important regulatory roles by targeting mRNAs for degradation or translational repression. The May 2007 release of the miRBase database lists 377 known mouse miRNAs and 475 known human miRNAs, the majority of which are conserved between the two species. Several recent reports suggest that many mammalian miRNAs remain to be discovered. The possibility that further miRNAs are expressed at lower levels or in more specialized expression contexts calls for the exploitation of genome sequence information to accelerate their discovery.

METHODOLOGY/PRINCIPAL FINDINGS: In this article, we describe a computational method, mirCoS, that applies three support vector machine models sequentially to discover new miRNA candidates in mammalian genomes based on sequence, secondary structure, and conservation. mirCoS can efficiently detect the majority of known miRNAs and predicts an extensive set of hairpin structures based on human-mouse comparisons. In total, 3476 mouse candidates and 3441 human candidates were found. These hairpins are more similar to known miRNAs than to negative controls in several aspects not considered by the prediction algorithm. A significant fraction of the predictions is supported by existing expression evidence.

CONCLUSIONS/SIGNIFICANCE: Using a novel approach, mirCoS performs comparably to or better than existing miRNA prediction methods, and contributes a significant number of new candidate miRNAs for experimental verification.
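The sequential use of three models described in the abstract can be pictured as a filter cascade: a candidate hairpin is kept only if every stage accepts it. The sketch below is illustrative only and is not the mirCoS implementation. It assumes one stage per feature type (sequence, structure, conservation), which the abstract does not state, and it uses hand-written linear decision functions with made-up weights as stand-ins for trained support vector machines. All names, feature vectors, and numbers are hypothetical.

```python
from typing import Any, Callable, Dict, List, Sequence, Tuple

Scorer = Callable[[Sequence[float]], float]


def linear_decision(weights: Sequence[float], bias: float) -> Scorer:
    """Build a scorer computing w.x + b, the form of a linear SVM decision function."""
    def score(features: Sequence[float]) -> float:
        return sum(w * x for w, x in zip(weights, features)) + bias
    return score


# Hypothetical stages. The weights and biases are invented for illustration;
# a real model would learn them from known miRNAs and negative controls.
STAGES: List[Tuple[str, Scorer]] = [
    ("sequence", linear_decision([1.0, -0.5], 0.0)),
    ("structure", linear_decision([0.8, 0.2], -0.5)),
    ("conservation", linear_decision([1.5], -1.0)),
]


def cascade(candidates: List[Dict[str, Any]],
            stages: List[Tuple[str, Scorer]] = STAGES) -> List[Dict[str, Any]]:
    """Apply the stages in order, keeping only candidates with a positive score."""
    survivors = list(candidates)
    for feature_key, score in stages:
        survivors = [c for c in survivors if score(c[feature_key]) > 0]
    return survivors


# Toy candidates with invented feature vectors.
candidates = [
    {"name": "hairpin_A", "sequence": [0.9, 0.2],
     "structure": [0.7, 0.5], "conservation": [0.9]},
    {"name": "hairpin_B", "sequence": [0.1, 0.8],
     "structure": [0.7, 0.5], "conservation": [0.9]},
    {"name": "hairpin_C", "sequence": [0.9, 0.2],
     "structure": [0.7, 0.5], "conservation": [0.3]},
]

accepted = [c["name"] for c in cascade(candidates)]
print(accepted)
```

One consequence of a cascade of this shape is that later stages only see candidates that earlier stages accepted, so stage order affects how much work each model does but not which candidates pass all three.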


Similar resources

Application of Genetic Algorithm Based Support Vector Machine Model in Second Virial Coefficient Prediction of Pure Compounds

In this work, a Genetic Algorithm boosted Least Square Support Vector Machine model, an improved version of the Support Vector Machine model that solves a set of linear equations instead of a quadratic program, was used to estimate the second virial coefficient of 98 pure compounds. Compounds were classified into different groups. The best parameters were obtained by the Genetic Algorithm method ...


Protein Secondary Structure Prediction: a Literature Review with Focus on Machine Learning Approaches

A DNA sequence, although it contains all genetic traits, is not a functional entity. Instead, it is converted into protein sequences through transcription and translation. The protein sequence later takes on a 3D structure, which is the functional unit and manages biological interactions using the information encoded in DNA. Every life process one can think of is undertaken by proteins with specific functio...


A Wavelet Support Vector Machine Combination Model for Daily Suspended Sediment Forecasting

In this study, a wavelet support vector machine (WSVM) model is proposed for daily suspended sediment (SS) prediction. The WSVM model combines two methods: discrete wavelet analysis and the support vector machine (SVM). The developed model was compared with a single SVM. Daily discharge (Q) and SS data from the Yadkin River at Yadkin College, NC station in the USA were used. I...


PREDICTION OF SLOPE STABILITY STATE FOR CIRCULAR FAILURE: A HYBRID SUPPORT VECTOR MACHINE WITH HARMONY SEARCH ALGORITHM

The slope stability analysis is routinely performed by engineers to estimate the stability of river training works, road embankments, embankment dams, excavations and retaining walls. This paper presents a new approach to build a model for the prediction of slope stability state. The support vector machine (SVM) is a new machine learning method based on statistical learning theory, which can so...


Prediction of true critical temperature and pressure of binary hydrocarbon mixtures: A Comparison between the artificial neural networks and the support vector machine

Two main objectives have been considered in this paper: providing a good model to predict the critical temperature and pressure of binary hydrocarbon mixtures, and comparing the efficiency of the artificial neural network algorithms and the support vector regression as two commonly used soft computing methods. In order to have a fair comparison and to achieve the highest efficiency, a comprehen...


Mammalian Eye Gene Expression Using Support Vector Regression to Evaluate a Strategy for Detecting Human Eye Disease

Background and purpose: Machine learning is a class of modern, powerful tools that can solve many important problems people face today. Support vector regression (SVR) is a method for building regression models and a notable member of the machine learning family. SVR has been proven to be an effective tool in real-value function estimation. As a supervised-learning appr...



Journal title:
  • PLoS ONE

Volume 2


Publication date: 2007